Juanzi Li

MM-THEBench: Do Reasoning MLLMs Think Reasonably?

Jan 30, 2026

On the Paradoxical Interference between Instruction-Following and Task Solving

Jan 29, 2026

RPC-Bench: A Fine-grained Benchmark for Research Paper Comprehension

Jan 14, 2026

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Jan 09, 2026

WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection

Oct 21, 2025

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Oct 02, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025

LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning

Jun 23, 2025

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Jun 11, 2025